Backfilling with lookahead to optimize the packing of parallel jobs

Authors

  • Edi Shmueli
  • Dror G. Feitelson
Abstract

The utilization of parallel computers depends on how jobs are packed together: if the jobs are not packed tightly, resources are lost due to fragmentation. The problem is that the goal of high utilization may conflict with goals of fairness or even progress for all jobs. The common solution is to use backfilling, which combines a reservation for the first job in the interest of progress with packing of later jobs to fill in holes and increase utilization. However, backfilling considers the queued jobs one at a time, and thus might miss better packing opportunities. We propose the use of dynamic programming to find the best packing possible given the current composition of the queue, thus maximizing the utilization on every scheduling step. Simulations of this algorithm, called LOS (Lookahead Optimizing Scheduler), using trace files from several IBM SP parallel systems, show that LOS indeed improves utilization, and thereby reduces the mean response time and mean slowdown of all jobs. Moreover, it is actually possible to limit the lookahead depth to about 50 jobs and still achieve essentially the same results. Finally, we experimented with selecting among alternative sets of jobs that achieve the same utilization. Surprising results indicate that choosing the set at the head of the queue does not necessarily guarantee best performance. Instead, repeatedly selecting the set with the maximal overall expected slowdown boosts performance when compared to all other alternatives checked.
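The contrast between considering queued jobs one at a time and searching for the best packing can be illustrated with a small sketch. This is not the authors' LOS implementation: it is a simplified subset-sum style dynamic program that assumes each job is described only by its processor count, and it ignores the reservation for the first job and all job runtimes. The function names `greedy_packing` and `best_packing` are hypothetical.

```python
from typing import List, Optional, Tuple


def greedy_packing(job_sizes: List[int], free: int) -> Tuple[int, List[int]]:
    """Scan the queue in order and start every job that still fits."""
    used, chosen = 0, []
    for idx, size in enumerate(job_sizes):
        if used + size <= free:
            used += size
            chosen.append(idx)
    return used, chosen


def best_packing(job_sizes: List[int], free: int) -> Tuple[int, List[int]]:
    """Pick the job set that uses as many of the `free` processors as possible.

    Assumes positive integer job sizes. Returns (processors_used, job_indices).
    """
    # reachable[c] is one set of job indices using exactly c processors,
    # or None if no such set has been found yet.
    reachable: List[Optional[List[int]]] = [None] * (free + 1)
    reachable[0] = []
    for idx, size in enumerate(job_sizes):
        # Walk capacities downward so each job is used at most once.
        for c in range(free, size - 1, -1):
            if reachable[c] is None and reachable[c - size] is not None:
                reachable[c] = reachable[c - size] + [idx]
    best = max(c for c in range(free + 1) if reachable[c] is not None)
    return best, reachable[best]


if __name__ == "__main__":
    queue = [7, 5, 5]   # processors requested by each queued job (made-up values)
    print(greedy_packing(queue, 10))  # the first job blocks both later jobs
    print(best_packing(queue, 10))    # the two later jobs fill the machine
```

With the made-up queue above, the one-at-a-time scan starts only the 7-processor job and leaves 3 processors idle, while the dynamic program finds the two 5-processor jobs that fill all 10. The table has `free + 1` entries and is updated once per job, so the cost grows with the number of jobs examined, which fits the abstract's point that limiting the lookahead depth keeps the search cheap.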

Similar resources

Parallel Job Scheduling in Cloud with Lookahead and Workload Consolidation

The cloud computing paradigm enables consumers to run their applications in remote data centers. Many of these applications may be complex which requires parallel processing capabilities. Parallel job scheduling techniques mainly focus on improving responsiveness and utilization. For a data center that deals with parallel jobs, it is important to devise an optimal schedule which results in maxi...

Parallel Job Scheduling under Dynamic Workloads

Jobs that run on parallel systems that use gang scheduling for multiprogramming may interact with each other in various ways. These interactions are affected by system parameters such as the level of multiprogramming and the scheduling time quantum. A careful evaluation is therefore required in order to find parameter values that lead to optimal performance. We perform a detailed performance ev...

Self-Adapting Backfilling Scheduling for Parallel Systems

We focus on non-FCFS job scheduling policies for parallel systems that allow jobs to backfill, i.e., to move ahead in the queue, given that they do not delay certain previously submitted jobs. Consistent with commercial schedulers that maintain multiple queues where jobs are assigned according to the user-estimated duration, we propose a self-adapting backfilling policy that maintains multiple ...

Utilization, Predictability, Workloads, and User Runtime Estimates in Scheduling the IBM SP2 with Backfilling

Scheduling jobs on the IBM SP2 system and many other distributed-memory MPPs is usually done by giving each job a partition of the machine for its exclusive use. Allocating such partitions in the order in which the jobs arrive (FCFS scheduling) is fair and predictable, but suffers from severe fragmentation, leading to low utilization. This situation led to the development of the EASY scheduler...

Backfilling with Guarantees Granted upon Job Submission

In this paper, we present scheduling algorithms that simultaneously support guaranteed starting times and favor jobs with system-desired traits. To achieve the first of these goals, our algorithms keep a profile with potential starting times for every unfinished job and never move these starting times later, just as in Conservative Backfilling. To achieve the second, they exploit previously unre...

Journal:
  • J. Parallel Distrib. Comput.

Volume 65

Publication date: 2005